Acoustic Images Based on Hartley Two-dimensional Root Cepstrum Analysis for Speech Recognition
Abstract
It has been shown that detailed information in non-stationary signals such as speech is better represented by an acoustic image, a two-dimensional feature representation [10]. Several time-frequency representations, such as the spectrogram and the Wigner-Ville and Choi-Williams distributions, have been proposed [6]; acoustic images based on two-dimensional root cepstrum (TDRC) analysis are a special case [5]. This paper proposes acoustic images based on Hartley two-dimensional root cepstrum (HTDRC) analysis, which represent non-stationary signals such as speech while preserving both the magnitude and the phase details of the signal simultaneously. The representation can also extract details of both the static and the dynamic features of the signal. Experimental results demonstrate that acoustic images based on the HTDRC outperform those based on the TDRC in speech recognition applications, increasing recognition accuracy by 8.1%.
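The abstract does not give the HTDRC formulas, so the following is only a minimal sketch of how a Hartley-based two-dimensional root cepstrum could be computed, not the authors' implementation. It assumes the non-separable 2-D discrete Hartley transform (obtained from the FFT as real part minus imaginary part), a sign-preserving power-law compression with a hypothetical exponent `gamma`, and a hypothetical Hamming-windowed framing helper; the function names `dht2`, `hartley_root_cepstrum_2d` and `frame_signal` are illustrative only.

```python
import numpy as np


def dht2(x):
    """2-D discrete Hartley transform of a real array (non-separable cas kernel)."""
    f = np.fft.fft2(x)
    return f.real - f.imag


def hartley_root_cepstrum_2d(frames, gamma=0.5):
    """Assumed HTDRC-style image: DHT, signed root compression, inverse DHT.

    frames: real 2-D array, one windowed speech frame per row.
    gamma:  root exponent (assumption; the abstract states no value).
    """
    h = dht2(frames)
    # The Hartley spectrum is real, so its sign carries the phase information;
    # compressing only the magnitude keeps that sign intact.
    compressed = np.sign(h) * np.abs(h) ** gamma
    # The 2-D DHT is its own inverse up to a factor of 1 / (rows * cols).
    return dht2(compressed) / compressed.size


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping Hamming-windowed frames (rows)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech_like = rng.standard_normal(4000)  # stand-in for a real utterance
    image = hartley_root_cepstrum_2d(frame_signal(speech_like, 256, 128))
    print(image.shape)
```

In this sketch the rows of the input index time (frames) and the columns index samples within a frame, so the second transform dimension captures how the spectrum changes across frames. Setting `gamma=1.0` disables the compression and returns the framed input unchanged, which is a convenient sanity check.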
Similar resources
Speaker-Independent Speech Recognition Using Acoustic Images Based On The TDRC
Conventional cepstra are one-dimensional; however, speech characteristics are better represented by an acoustic image, a two-dimensional feature representation. In this paper, acoustic images based on the two-dimensional root cepstrum (TDRC) are used as features for speaker-independent speech recognition. The TDRC is a feature extraction method that has some advantages over other methods. The ...
Analysis of the root-cepstrum for acoustic modeling and fast decoding in speech recognition
Root-cepstral analysis has been proposed previously for speech recognition in car environments [9]. In this paper, we focus on an alternative aspect of Root-cepstrum as it applies to discriminative acoustic modeling and fast speech recognizer decoding. We compare Root-cepstrum to Mel-Frequency cepstrum Coefficients (MFCC) in terms of their noise immunity during modeling and decoding speed. Our ...
Experiments on a parametric nonlinear spectral warping for an HMM-based speech recognizer
This paper is concerned with the search for an optimal feature-set for a speech recognition system. A better acoustic feature analysis that suitably enhances the semantic information in a consistent fashion can reduce raw-score (no grammar) error rate significantly. A simple two-dimensional parameterized feature set is proposed. The feature-set is compared against a standard mel-cepstrum, LPC-b...
Integrated Feature Normalization and Enhancement for robust Speaker Recognition using Acoustic Factor Analysis
State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In th...